pbdMPI (version 0.5-2)

global balanc: Global Balance Functions

Description

These functions provide global balance methods for a gbd data.frame (or matrix) distributed in row blocks.

Usage

comm.balance.info(X.gbd, balance.method = .pbd_env$SPMD.IO$balance.method[1],
                  comm = .pbd_env$SPMD.CT$comm)
comm.load.balance(X.gbd, bal.info = NULL,
                  balance.method = .pbd_env$SPMD.IO$balance.method[1],
                  comm = .pbd_env$SPMD.CT$comm)
comm.unload.balance(new.X.gbd, bal.info, comm = .pbd_env$SPMD.CT$comm)

Value

comm.balance.info() returns a list containing balance information based on the input X.gbd and balance.method.

comm.load.balance() returns a new gbd data.frame (or matrix).

comm.unload.balance() also returns a gbd data.frame, with new.X.gbd redistributed back to the layout of the original X.gbd.

Arguments

X.gbd

a gbd data.frame (or matrix).

balance.method

a balance method.

bal.info

balance information returned by comm.balance.info(). If NULL, it is generated inside comm.load.balance().

new.X.gbd

a new gbd of X.gbd (may be generated by comm.load.balance()).

comm

a communicator number.

Author

Wei-Chen Chen wccsnow@gmail.com, George Ostrouchov, Drew Schmidt, Pragneshkumar Patel, and Hao Yu.

Details

A typical use is to balance an input dataset X.gbd read by comm.read.table(). By default, a two-dimensional data.frame is distributed in row blocks, but the processors (ranks) may not hold the same, or even nearly the same, number of rows. These functions redistribute the data.frame (or matrix) in the way specified by bal.info.

Currently, three balance methods are supported: block (uniformly distributed but favoring higher ranks), block0 (like block but favoring lower ranks), and block.cyclic (block-cyclic with one big block per cycle).
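To make the difference between block and block0 concrete, the following base R sketch counts how many rows each rank would own when the rows do not divide evenly. It does not call pbdMPI and is not the package's implementation: rows.per.rank() is a hypothetical helper, and it assumes that "favor" means the leftover rows go one each to the highest ranks (block) or the lowest ranks (block0). block.cyclic is not covered.

```r
# Hypothetical helper (not part of pbdMPI): number of rows each rank would own
# when n rows are split as evenly as possible across n.ranks ranks.
# Assumption: leftover rows go to the highest ranks for "block" and to the
# lowest ranks for "block0".
rows.per.rank <- function(n, n.ranks, method = c("block", "block0")) {
  method <- match.arg(method)
  base <- n %/% n.ranks   # rows every rank receives
  extra <- n %% n.ranks   # leftover rows, handed out one per rank
  counts <- rep(base, n.ranks)
  if (extra > 0) {
    idx <- if (method == "block") {
      seq(n.ranks - extra + 1, n.ranks)  # highest ranks take the leftovers
    } else {
      seq_len(extra)                     # lowest ranks take the leftovers
    }
    counts[idx] <- counts[idx] + 1
  }
  names(counts) <- paste0("rank", seq_len(n.ranks) - 1)
  counts
}

rows.per.rank(7, 4, "block")   # 1 2 2 2
rows.per.rank(7, 4, "block0")  # 2 2 2 1
```

Under this assumption, both methods keep the row counts within one of each other; they differ only in which end of the rank order receives the extra rows.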

References

Programming with Big Data in R Website: https://pbdr.org/

See Also

comm.read.table(), comm.write.table(), and comm.as.gbd().

Examples

if (FALSE) {
### Save code in a file "demo.r" and run with 2 processors by
### SHELL> mpiexec -np 2 Rscript demo.r

spmd.code <- "
### Initialize
suppressMessages(library(pbdMPI, quietly = TRUE))

### Get two gbd row-block data.frame.
da.block <- iris[get.jid(nrow(iris), method = \"block\"),]
da.block0 <- iris[get.jid(nrow(iris), method = \"block0\"),]

### Load balance one and unload it.
bal.info <- comm.balance.info(da.block0)
da.new <- comm.load.balance(da.block0)
da.org <- comm.unload.balance(da.new, bal.info)

### Check if all are equal.
comm.print(c(sum(da.new != da.block), sum(da.org != da.block0)),
           all.rank = TRUE)

### Finish.
finalize()
"
# execmpi(spmd.code, nranks = 2L)
}